Decennial Census
American Community Survey (ACS)
4/1/2022
Welcome! While we’re waiting:
Navigate to the workshop webpage: https://github.com/dlab-berkeley/Census-Data-in-R
Scroll down and read the Readme section.
Clone or download the workshop files by clicking on the green CODE button.
If you download the zipfile, unzip it.
Make a note of the folder in which the workshop files reside.
About me
About you
Brief overview of the primary US Census data products
Introduce R packages for working with census data
Use those packages to fetch census data
Use those packages to fetch census data plus census geographic boundary files
Make maps of census data
The “nation’s leading provider of quality data about its people and economy.”
Decennial Census
American Community Survey (ACS)
Complete count of the population every 10 years since 1790
A snapshot of the American population in time, with an April 1 reference date.
Includes data on
Population: by sex, age, race/ethnicity, and family / household relationships
Housing: by occupancy (occupied, vacant), tenure (owned, rented), and group quarters
From 1840 to 2000, additional questions were asked of a sample of the population.
Since 2005, the American Community Survey (ACS) has replaced the decennial census sample data questions.
Annual survey of a sample of about 3.5 million households, released as 1-, 3-, or 5-year period estimates.
Provides period estimates of demographic, social, economic, and housing characteristics
Includes margin of error values for the estimates
ACS 1-year and 5-year estimates are currently available through 2020
2020 data was just released!
ACS 3-year no longer available (2008-2013)
The ACS 1-year estimates include data from a sample of the population collected over a one-year period.
Five years of data are pooled together, weighted, and processed as a whole dataset to create the ACS 5-year estimates.
Use the ACS 1-year estimates when you want the most current data and are less concerned about precision (larger margins of error). However, the ACS 1-year estimates are only available for areas with large populations (65,000+) and for a subset of data tables.
Use the ACS 5-year estimates when you want more stability in the estimates, more data tables, and smaller geographic tabulation units. However, the data can be tricky to interpret if conditions changed sharply within the five-year period (e.g., COVID-19 and the 2016-2020 ACS 5-year).
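To make the precision trade-off concrete, here is a minimal sketch in base R. It relies on the fact that published ACS margins of error use a 90% confidence level, so a standard error is the MOE divided by 1.645. The estimates and MOEs below are hypothetical numbers made up for illustration, not real census values, and the function names are our own.

```r
# ACS margins of error (MOE) are published at the 90% confidence level,
# so the standard error is MOE / 1.645.
moe_to_se <- function(moe) moe / 1.645

# Coefficient of variation (CV) as a percentage: a rough reliability measure.
cv_pct <- function(estimate, moe) 100 * moe_to_se(moe) / estimate

# HYPOTHETICAL median income estimates for the same area (illustration only)
est_1yr <- 52000; moe_1yr <- 6500   # 1-year: more current, less precise
est_5yr <- 51000; moe_5yr <- 2100   # 5-year: less current, more precise

cv_1yr <- cv_pct(est_1yr, moe_1yr)
cv_5yr <- cv_pct(est_5yr, moe_5yr)

# 90% confidence interval around the 5-year estimate
ci_5yr <- c(lower = est_5yr - moe_5yr, upper = est_5yr + moe_5yr)

print(round(c(cv_1yr = cv_1yr, cv_5yr = cv_5yr), 1))
print(ci_5yr)
```

A smaller CV means a more reliable estimate, which is why the 5-year data are usually preferred for small areas.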
| Demographic* | Social | Economic | Housing |
|---|---|---|---|
| Sex | Families | Income | Tenure* |
| Age | Education | Benefits | Occupancy* |
| Race | Marital Status | Employment Status | Group quarters* |
| Hispanic Origin | Fertility | Occupation | Housing Value |
| Relationships | Grandparents | Industry | Taxes & Insurance |
| | Veterans | Commuting | Utilities |
| | Disability Status | Place of Work | Mortgage |
| | Language at Home | Health Insurance | Monthly Rent |
| | Citizenship | | Structure Type |
| | Mobility | | |

*Also included in the decennial census
Census data is collected from individuals. The individual-level response data is called microdata.
For privacy reasons, only a very limited subset of census microdata is publicly available as the Public Use Microdata Samples (PUMS) data.
Most census data is made publicly available only when aggregated to a geographic tabulation unit.
Not all census data is available for all geographic tabulation units. For example, only decennial census data are available at the block level.
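Geographic tabulation units nest inside one another, and that nesting shows up in the identifiers attached to census data. As a hedged illustration (the GEOID concept is not introduced above, and the helper name and example ID are our own), a 15-digit census block GEOID is built from a 2-digit state code, a 3-digit county code, a 6-digit tract code, and a 4-digit block code, so the parent units can be recovered by truncating it:

```r
# Split a 15-digit census block GEOID into the IDs of its parent units.
# Layout: state (2) + county (3) + tract (6) + block (4).
# The first digit of the block code identifies the block group.
parse_block_geoid <- function(geoid) {
  stopifnot(is.character(geoid), length(geoid) == 1, nchar(geoid) == 15)
  list(
    state       = substr(geoid, 1, 2),
    county      = substr(geoid, 1, 5),
    tract       = substr(geoid, 1, 11),
    block_group = substr(geoid, 1, 12),
    block       = geoid
  )
}

# Illustrative ID only: state 06 (California), county 001 (Alameda)
ids <- parse_block_geoid("060014001001000")
str(ids)
```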
Identify your
Topic of interest, e.g., population by age, income, monthly rents, etc.
Dataset: Decennial Census or ACS 1-yr or ACS 5-yr?
Year(s): for what time period?
Geographic tabulation unit of aggregation (county, tract, etc.)
Geographic filter by state(s) or counties
Then determine what specific census variables are available for your topic.
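These choices map fairly directly onto the arguments of `get_acs()` in the tidycensus package. The sketch below is one possible set of choices, not the workshop's own example: the topic (median household income, variable `B19013_001`), the county, and the year are assumptions picked for illustration. The fetch only runs if tidycensus is installed and a key is stored in the `CENSUS_API_KEY` environment variable.

```r
# One set of choices, written down as a list of get_acs() arguments.
acs_query <- list(
  geography = "tract",                               # tabulation unit
  variables = c(median_hh_income = "B19013_001"),    # topic -> variable
  state     = "CA",                                  # geographic filter
  county    = "Alameda",                             # geographic filter
  year      = 2019,                                  # end year of the period
  survey    = "acs5"                                 # dataset: ACS 5-year
)

# Only attempt the request when the package and an API key are available.
can_fetch <- requireNamespace("tidycensus", quietly = TRUE) &&
  nzchar(Sys.getenv("CENSUS_API_KEY"))

if (can_fetch) {
  income <- do.call(tidycensus::get_acs, acs_query)
  print(head(income))
} else {
  message("Skipping fetch: install tidycensus and set CENSUS_API_KEY first.")
}
```

Tracts are only tabulated in the 5-year data, so asking for `survey = "acs1"` with `geography = "tract"` would not work.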
“If you want to measure change you can’t change the measures!”
Census tables, variables, geographies, and geographic boundaries change over time!
Measuring change over time with census data is a complex topic in its own right and is not covered by this workshop!
Here are three of the primary websites from which you can directly download census data:
You can download Census geographic data directly on the Census website.
You can write code to fetch data from the Census Web APIs
API: application programming interface
Web API: URLs can be formatted to make queries that return data
Or you can leverage an existing R package to make this easier!
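To see what such a package is doing behind the scenes, here is a rough sketch of building a Census API query URL by hand in base R. The helper `build_census_url()` is hypothetical (written for this example, not part of any package), and the variable and geography are illustrative choices. No URL encoding is done, which is fine for simple queries like this one.

```r
# Build a Census Data API query URL from its parts.
# year/dataset select the data product; `get` lists variables;
# `for` and `in` select the geography; `key` is your API key.
build_census_url <- function(year, dataset, vars, geo_for,
                             geo_in = NULL, key = NULL) {
  base <- sprintf("https://api.census.gov/data/%s/%s", year, dataset)
  params <- c(get = paste(vars, collapse = ","), `for` = geo_for)
  if (!is.null(geo_in)) params <- c(params, `in` = geo_in)
  if (!is.null(key))    params <- c(params, key = key)
  paste0(base, "?", paste(names(params), params, sep = "=", collapse = "&"))
}

# Median household income for every county in California (state FIPS 06),
# from the 2015-2019 ACS 5-year estimates.
url <- build_census_url(
  year    = 2019,
  dataset = "acs/acs5",
  vars    = c("NAME", "B19013_001E"),
  geo_for = "county:*",
  geo_in  = "state:06"
)
print(url)
```

Pasting a URL like this into a browser returns the data as JSON. A package saves you from assembling the URL and reshaping that response yourself.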
Only a subset of recent Census data products are available via APIs.
These are the ones we recommend and will use today.
An R package with functions that make it easier to fetch decennial census and ACS data from the Census APIs.
Only a limited set of Census data is available via tidycensus:
Decennial census: 1990, 2000, and 2010
ACS 1-year: 2005 through 2019
ACS 5-year: 2005-2009 through 2015-2019
Actively maintained and expanding to include more census data products (see tidycensus website)
Developed by Kyle Walker to make it easier to fetch data from Census APIs in R in a tidy format to analyze, plot, and map.
Check out his website (https://walker-data.com/) to keep abreast of his great packages, blog posts, and tutorials.
And his new ebook Analyzing the US Census with R, currently available to read online.
The tidyverse package is an umbrella package that installs all the core tidyverse packages and makes them easier to manage and load in R, including:
ggplot2, for data visualization
dplyr, for data manipulation
tidyr, for data tidying
readr, for data import
purrr, for functional programming
tibble, for tibbles, a modern re-imagining of data frames
stringr, for strings
forcats, for factors
sf: simple features for geospatial data objects and methods.
vector: locations represented as points, lines and polygons
sf is loaded and used automatically by tidycensus.
The online book Geocomputation with R is a great resource for learning about the sf package and working with geospatial data in R.
mapview provides functions to quickly and easily create interactive maps for data exploration.
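As a rough sketch of how these packages fit together, the code below asks tidycensus for county-level data with boundaries attached (`geometry = TRUE` returns an sf object) and passes the result to `mapview()`. The variable, state, and year are illustrative assumptions, not the workshop's own example. The request only runs if the packages are installed and a key is stored in `CENSUS_API_KEY`.

```r
# Check that the packages and an API key are available before fetching.
needed    <- c("tidycensus", "sf", "mapview")
have_pkgs <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))
can_map   <- have_pkgs && nzchar(Sys.getenv("CENSUS_API_KEY"))

if (can_map) {
  # geometry = TRUE attaches county boundaries, returning an sf object
  county_income <- tidycensus::get_acs(
    geography = "county",
    variables = "B19013_001",   # median household income (illustrative choice)
    state     = "CA",
    year      = 2019,
    survey    = "acs5",
    geometry  = TRUE
  )
  # Interactive map, with counties colored by the estimate column
  m <- mapview::mapview(county_income, zcol = "estimate")
  print(m)
} else {
  message("Skipping map: need tidycensus, sf, mapview, and CENSUS_API_KEY.")
}
```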
Before you can fetch data from the Census APIs, you must have a free Census API Key
Request one now if you don’t have one yet!
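Once you have a key, tidycensus can store it so you do not have to paste it into every script. The sketch below only checks whether a key is already stored in the `CENSUS_API_KEY` environment variable, which is where tidycensus looks for it. `"YOUR_KEY_HERE"` is a placeholder, not a real key.

```r
# Look for a stored key in the environment variable tidycensus uses.
api_key    <- Sys.getenv("CENSUS_API_KEY")
key_is_set <- nzchar(api_key)

if (!key_is_set) {
  message(
    "No Census API key found. After requesting one, run this once:\n",
    '  tidycensus::census_api_key("YOUR_KEY_HERE", install = TRUE)\n',
    "then restart R so the key is read from your .Renviron file."
  )
} else {
  message("Census API key found (", nchar(api_key), " characters).")
}
```

Using `install = TRUE` writes the key to `.Renviron`, which keeps it out of your scripts and out of any repository you share.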
Clone, or download and unzip, the workshop files from: https://github.com/dlab-berkeley/Census-Data-in-R
Then:
Open the folder with the workshop files
Double-click on the R Project file Census-Data-in-R.Rproj
This should open RStudio - with the Files panel displaying the workshop folder contents.
Double-click on the file Census-Data-in-R.Rmd in the Lessons folder to follow along in RStudio!
Or open Census-Data-in-R.html to follow along in a web browser.